Eecient Handling of Large Sets of Tuples with Sharing Trees Eecient Handling of Large Sets of Tuples with Sharing Trees
نویسندگان
چکیده
Computing with sets of tuples (n-ary relations) is often required in programming. This paper presents a new data structure dedicated to the manipulation of large sets of tuples, dubbed sharing tree. The main idea is to share in memory some sub-tuples of the set. We give algorithms for common set operations, that have theoretical complexities proportional to the sizes of the sharing trees given as arguments, which are usually much smaller than the sizes of the represented sets.
منابع مشابه
Distributed Algorithms for the Transitive Closure
Many database queries, such as reachability and regular path queries, can be reduced to finding the transitive closure of the underlying graph. For calculating the transitive closure of large graphs, a distributed computation framework is required to handle the large data volume (which can approach O(|V |) space). Map Reduce was not originally designed for recursive computations, but recent wor...
متن کاملSharply $(n-2)$-transitive Sets of Permutations
Let $S_n$ be the symmetric group on the set $[n]={1, 2, ldots, n}$. For $gin S_n$ let $fix(g)$ denote the number of fixed points of $g$. A subset $S$ of $S_n$ is called $t$-emph{transitive} if for any two $t$-tuples $(x_1,x_2,ldots,x_t)$ and $(y_1,y_2,ldots ,y_t)$ of distinct elements of $[n]$, there exists $gin S$ such that $x_{i}^g=y_{i}$ for any $1leq ileq t$ and additionally $S$ is called e...
متن کاملCompression-Based Discretization of Continuous Attributes
Discretization of continuous attributes into ordered discrete attributes can be beneecial even for propositional induction algorithms that are capable of handling continuous attributes directly. Beneets include possibly large improvements in induction time, smaller sizes of induced trees or rule sets, and even improved predictive accuracy. We deene a global evaluation measure for discretization...
متن کاملComparison of Ordinal Response Modeling Methods like Decision Trees, Ordinal Forest and L1 Penalized Continuation Ratio Regression in High Dimensional Data
Background: Response variables in most medical and health-related research have an ordinal nature. Conventional modeling methods assume predictor variables to be independent, and consider a large number of samples (n) compared to the number of covariates (p). Therefore, it is not possible to use conventional models for high dimensional genetic data in which p > n. The present study compared th...
متن کاملText database systems
A text database system, often called an information retrieval system, is designed to process a text model of the data, viewed as an ordered sequence of documents, paragraphs, sentences, words (i.e., as a list structure). Although relations are sets of tuples, and therefore unordered, the relational model can still be used successfully for text, but surprisingly it is shown that at the physical ...
متن کامل